AITopics | successive halving

Collaborating Authors

successive halving

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Successive Halving with Learning Curve Prediction via Latent Kronecker Gaussian Processes

Lin, Jihao Andreas, Mayoraz, Nicolas, Rendle, Steffen, Kuzmin, Dima, Praun, Emil, Isik, Berivan

arXiv.org Artificial IntelligenceAug-21-2025

Successive Halving is a popular algorithm for hyperparameter optimization which allocates exponentially more resources to promising candidates. However, the algorithm typically relies on intermediate performance values to make resource allocation decisions, which can cause it to prematurely prune slow starters that would eventually become the best candidate. We investigate whether guiding Successive Halving with learning curve predictions based on Latent Kronecker Gaussian Processes can overcome this limitation. In a large-scale empirical study involving different neural network architectures and a click prediction dataset, we compare this predictive approach to the standard approach based on current performance values. Our experiments show that, although the predictive approach achieves competitive performance, it is not Pareto optimal compared to investing more resources into the standard approach, because it requires fully observed learning curves as training data. However, this downside could be mitigated by leveraging existing learning curve data.

artificial intelligence, machine learning, performance value, (14 more...)

arXiv.org Artificial Intelligence

2508.14818

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)

Add feedback

FlexHB: a More Efficient and Flexible Framework for Hyperparameter Optimization

Zhang, Yang, Wu, Haiyang, Yang, Yuekui

arXiv.org Artificial IntelligenceFeb-21-2024

Given a Hyperparameter Optimization(HPO) problem, how to design an algorithm to find optimal configurations efficiently? Bayesian Optimization(BO) and the multi-fidelity BO methods employ surrogate models to sample configurations based on history evaluations. More recent studies obtain better performance by integrating BO with HyperBand(HB), which accelerates evaluation by early stopping mechanism. However, these methods ignore the advantage of a suitable evaluation scheme over the default HyperBand, and the capability of BO is still constrained by skewed evaluation results. In this paper, we propose FlexHB, a new method pushing multi-fidelity BO to the limit as well as re-designing a framework for early stopping with Successive Halving(SH). Comprehensive study on FlexHB shows that (1) our fine-grained fidelity method considerably enhances the efficiency of searching optimal configurations, (2) our FlexBand framework (self-adaptive allocation of SH brackets, and global ranking of configurations in both current and past SH procedures) grants the algorithm with more flexibility and improves the anytime performance. Our method achieves superior efficiency and outperforms other methods on various HPO tasks. Empirical results demonstrate that FlexHB can achieve up to 6.9X and 11.1X speedups over the state-of-the-art MFES-HB and BOHB respectively.

configuration, fidelity level, hyperband, (14 more...)

arXiv.org Artificial Intelligence

2402.13641

Country:

Asia > China > Beijing > Beijing (0.04)
Europe > Spain > Andalusia > Cádiz Province > Cadiz (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Iterative Deepening Hyperband

Brandt, Jasmin, Wever, Marcel, Iliadis, Dimitrios, Bengs, Viktor, Hüllermeier, Eyke

arXiv.org Artificial IntelligenceFeb-6-2023

Hyperparameter optimization (HPO) is concerned with the automated search for the most appropriate hyperparameter configuration (HPC) of a parameterized machine learning algorithm. A state-of-the-art HPO method is Hyperband, which, however, has its own parameters that influence its performance. One of these parameters, the maximal budget, is especially problematic: If chosen too small, the budget needs to be increased in hindsight and, as Hyperband is not incremental by design, the entire algorithm must be re-run. This is not only costly but also comes with a loss of valuable knowledge already accumulated. In this paper, we propose incremental variants of Hyperband that eliminate these drawbacks, and show that these variants satisfy theoretical guarantees qualitatively similar to those for the original Hyperband with the "right" budget. Moreover, we demonstrate their practical utility in experiments with benchmark data sets.

artificial intelligence, configuration, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2302.00511

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Belgium > Flanders > East Flanders > Ghent (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)

Add feedback

A resource-efficient method for repeated HPO and NAS problems

Zappella, Giovanni, Salinas, David, Archambeau, Cédric

arXiv.org Artificial IntelligenceMar-30-2021

In this work we consider the problem of repeated hyperparameter and neural architecture search (HNAS).We propose an extension of Successive Halving that is able to leverage information gained in previous HNAS problems with the goal of saving computational resources. We empirically demonstrate that our solution is able to drastically decrease costs while maintaining accuracy and being robust to negative transfer. Our method is significantly simpler than competing transfer learning approaches, setting a new baseline for transfer learning in HNAS. Creating predictive models requires data scientists to delve into data sources, understand and visualize the raw data, apply multiple data transformations and pick a target metric. Searching deep learning architecture and optimization the hyperparameters are often left as a manual step to be performed "from time to time" in practice. However, best practice dictates that reusing historical architectures and hyperparameters under different experimental conditions can negatively impact the predictive performance.

algorithm, configuration, dataset, (16 more...)

arXiv.org Artificial Intelligence

2103.16111

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Systems & Languages > Problem-Independent Architectures (0.54)

Add feedback

Reducing The Search Space For Hyperparameter Optimization Using Group Sparsity

Cho, Minsu, Hegde, Chinmay

arXiv.org Machine LearningApr-24-2019

We propose a new algorithm for hyperparameter selection in machine learning algorithms. The algorithm is a novel modification of Harmonica, a spectral hyperparameter selection approach using sparse recovery methods. In particular, we show that a special encoding of hyperparameter space enables a natural group-sparse recovery formulation, which when coupled with HyperBand (a multi-armed bandit strategy) leads to improvement over existing hyperparameter optimization methods such as Successive Halving and Random Search. Experimental results on image datasets such as CIFAR-10 confirm the benefits of our approach.

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv.org Machine Learning

1904.11095

Country: North America > United States > Iowa > Story County > Ames (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Metaoptimization on a Distributed System for Deep Reinforcement Learning

Heinrich, Greg, Frosio, Iuri

arXiv.org Machine LearningFeb-7-2019

Training intelligent agents through reinforcement learning is a notoriously unstable procedure. Massive parallelization on GPUs and distributed systems has been exploited to generate a large amount of training experiences and consequently reduce instabilities, but the success of training remains strongly influenced by the choice of the hyperparameters. To overcome this issue, we introduce HyperTrick, a new metaoptimization algorithm, and show its effective application to tune hyperparameters in the case of deep reinforcement learning, while learning to play different Atari games on a distributed system. Our analysis provides evidence of the interaction between the identification of the optimal hyperparameters and the learned policy, that is typical of the case of metaoptimization for deep reinforcement learning. When compared with state-of-the-art metaoptimization algorithms, HyperTrick is characterized by a simpler implementation and it allows learning similar policies, while making a more effective use of the computational resources in a distributed system.

hyperparameter, hypertrick, successive halving, (14 more...)

arXiv.org Machine Learning

1902.02725

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.56)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Pure-Exploration for Infinite-Armed Bandits with General Arm Reservoirs

Aziz, Maryam, Jamieson, Kevin, Aslam, Javed

arXiv.org Machine LearningNov-14-2018

This paper considers a multi-armed bandit game where the number of arms is much larger than the maximum budget and is effectively infinite. We characterize necessary and sufficient conditions on the total budget for an algorithm to return an {\epsilon}-good arm with probability at least 1 - {\delta}. In such situations, the sample complexity depends on {\epsilon}, {\delta} and the so-called reservoir distribution {\nu} from which the means of the arms are drawn iid. While a substantial literature has developed around analyzing specific cases of {\nu} such as the beta distribution, our analysis makes no assumption about the form of {\nu}. Our algorithm is based on successive halving with the surprising exception that arms start to be discarded after just a single pull, requiring an analysis that goes beyond concentration alone. The provable correctness of this algorithm also provides an explanation for the empirical observation that the most aggressive bracket of the Hyperband algorithm of Li et al. (2017) for hyperparameter tuning is almost always best.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Machine Learning

1811.06149

Country: North America > United States (0.68)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Data Science > Data Mining > Big Data (0.68)

Add feedback